AMENDMENTS TO THE CLAIMS: 

This listing of claims will replace all prior versions, and listings, of claims in the 
application: 

LISTING OF CLAIMS: 

1. (Previously Presented) A method of displaying files within a file system 
to a user in a semantic hierarchy, the method comprising the steps of: 

mapping the files into a semantic vector space; 

clustering the files within said space, wherein multiple threshold values that 
are settable to desired levels of granularity are defined, and said files are clustered 
based on said multiple threshold values; 

deriving a hierarchy of plural levels of clusters from said clustering; and 

displaying the files in a hierarchical format of plural levels of clusters based on 
said derived hierarchy. 

2. (Original) The method according to claim 1, wherein the step of 
clustering the files is performed as a background routine during the operation of a 
computer associated with said file system. 

3. (Original) The method according to claim 2, wherein the step of 
clustering the files is performed in response to the creation of a new file within the file 
system. 



Attorney's Docket No. P2989-908 
Application No. 10/644,815 


4. (Original) The method according to claim 1, wherein said files are text 
documents and said mapping is conducted on the basis of a language model. 

5. (Original) The method according to claim 4, wherein said mapping step 
comprises the steps of constructing a matrix which associates each word in the 
documents with a vector and associates each document with a vector. 

6. (Original) The method of claim 5, further including the step of 
decomposing said matrix to define the words and documents as vectors in a 
continuous vector space. 

7. (Original) The method of claim 5, wherein said clustering is performed 
by identifying documents whose vectors are within a threshold distance of one 
another. 

8. (Canceled) 

9. (Previously Presented) The method of claim 5 further including the step 
of automatically labeling the clusters based on the resulting clusters. 

10. (Original) The method of claim 9 wherein said labeling comprises 
selecting representative words based on the closeness of their vectors to the 
document vectors in a cluster. 

11. (Previously Presented) A computer-readable medium containing a 
graphical user interface configured to display files in a virtual file system with a 
semantic hierarchy of plural levels of clusters that is derived from semantic 
similarities of said files, clustering said files based on multiple threshold values that 
are settable to desired levels of granularity, and determining a directory structure 
having plural levels of clusters based on the clustering determined from similarities 
between said files, wherein the graphical user interface graphically presents the 
determined directory structure having plural levels of clusters to be displayed on a 
display device. 

12. (Canceled) 

13. (Previously Presented) The graphical user interface according to claim 
11, wherein clustering of the files is initiated by user selection. 

14. (Previously Presented) The graphical user interface according to claim 
11, wherein clustering of the files is initiated upon creation of a new file in the file 
system. 

15. (Previously Presented) The graphical user interface according to claim 
11, wherein text files are clustered utilizing a language model and non-text files are 
clustered utilizing rule-based techniques. 

16. (Original) The graphical user interface according to claim 15, wherein 
said language model comprises the LSA paradigm. 

17. (Previously Presented) Computer readable media having stored 
therein computer executable code for analyzing files in a file system to determine 
similarities in data pertaining to their content, clustering said files based on multiple 
threshold values that are settable to desired levels of granularity, determining a 
directory structure having plural levels of clusters based on the clustering determined 
from similarities between the files, and displaying files in hierarchical format of plural 
levels of clusters based on the clustering determined from similarities between the 
files. 

18. (Original) The computer-readable media of claim 17 wherein said files 
are text documents, and the similarities are based upon the word content of the files. 

19. (Original) The computer-readable media of claim 18 wherein said 
similarities are determined in accordance with a language model, and the files are 
clustered in accordance with said model. 

20. (Original) The computer-readable media of claim 19, wherein said 
language model comprises the LSA paradigm. 

21. (Previously Presented) The computer-readable media of claim 19, 
wherein said computer-executable code performs the steps of constructing a matrix 
which associates each word in the documents with a vector and associates each 
document with a vector. 

22. (Original) The computer-readable media of claim 21, wherein said 
computer-executable code further performs step of decomposing said matrix to 
define the words and documents as vectors in a continuous vector space. 

23. (Original) The computer-readable media of claim 22, wherein said 
computer-executable code performs clustering by identifying documents whose 
vectors are within a threshold distance of one another. 

24. (Canceled) 

25. (Previously Presented) The computer-readable media of claim 19, 
wherein said computer-executable code performs step of automatically labeling the 
clusters based on the resulting clusters. 

26. (Original) The computer-readable media of claim 25, wherein said 
labeling comprises selecting representative words based on the closeness of their 
vectors to the document vectors in a cluster. 



27. (Previously Presented) The computer readable media according to 
claim 17, wherein the computer executable code performs the following steps: 

clustering text files within the file system using semantic similarities; 

clustering non-text files within the file system using rule-based techniques; 

labeling the resulting clusters; and 

displaying the files in a hierarchical format based on the resulting clusters and 
labels. 

28. (Previously Presented) A computer system, comprising: 

a file system storing files; 

a display device; 

a processor for analyzing the content of files stored in said file system to map 
said files into a semantic vector space, cluster the files within said space based on 
multiple threshold values that are settable to desired levels of granularity, and derive 
a hierarchy of plural levels of clusters from said clustering; and 

a user interface which displays representations of files stored in said file 
system in the form of said derived hierarchy of plural levels of clusters. 

29. (Canceled) 

30. (Previously Presented) The computer system of claim 28, wherein said 
files are text documents and said processor maps said files on the basis of a 
language model. 

31. (Original) The computer system of claim 30 wherein said processor 
constructs a matrix which associates each word in the documents with a vector and 
associates each document with a vector. 

32. (Original) The computer system of claim 31 wherein said processor 
further decomposes said matrix to define the words and documents as vectors in a 
continuous vector space. 

33. (Original) The computer system of claim 31, wherein said processor 
clusters the files by identifying documents whose vectors are within a threshold 
distance of one another. 

34. (Canceled) 

35. (Previously Presented) The computer system of claim 31, wherein said 
processor automatically labels the clusters based on the resulting clusters. 

36. (Original) The computer system of claim 35 wherein said processor 
labels the clusters by selecting representative words based on the closeness of their 
vectors to the document vectors in a cluster. 

37. (Previously Presented) The method according to claim 1, wherein said 
deriving step includes organizing the clusters into a hierarchical directory structure. 

38. (Previously Presented) A method of organizing a plurality of 
documents, comprising: 

mapping all words of the plurality of documents and the plurality of documents 
in a semantic vector space; 

generating a plurality of clusters based on the semantic similarities of the 
plurality of documents and multiple threshold values that are settable to desired 
levels of granularity; 

organizing the plurality of clusters into directories in a hierarchical format of 
plural levels of clusters; and 

displaying the plurality of documents in said hierarchical format of plural levels 
of clusters based on a result of clustering the plurality of documents. 

39-47. (Canceled) 

48. (New) The method of claim 1, wherein the multiple threshold values 
are characteristic values of clusters from said clustering. 

49. (New) The method of claim 48, wherein the characteristic values of the 
clusters are cluster variances of the clusters. 

50. (New) The graphical user interface according to claim 11, wherein the 
multiple threshold values are characteristic values of clusters from said clustering. 

51. (New) The graphical user interface according to claim 50, wherein the 
characteristic values of the clusters are cluster variances of the clusters. 

52. (New) The computer-readable media of claim 17, wherein the multiple 
threshold values are characteristic values of clusters from said clustering. 

53. (New) The computer-readable media of claim 52, wherein the 
characteristic values of the clusters are cluster variances of the clusters. 

54. (New) The computer system of claim 28, wherein the multiple threshold 
values are characteristic values of clusters from said clustering. 

55. (New) The computer system of claim 54, wherein the characteristic 
values of the clusters are cluster variances of the clusters. 

56. (New) The method of claim 38, wherein the multiple threshold values 
are characteristic values of clusters from said clustering. 

57. (New) The method of claim 56, wherein the characteristic values of 
the clusters are cluster variances of the clusters. 



